
    A Kalman Filter based Low Complexity Throughput Prediction Algorithm for 5G Cellular Networks

    Throughput prediction is one of the primary preconditions for the uninterrupted operation of several network-aware mobile applications, notably video streaming. Recent works have advocated Machine Learning (ML) and Deep Learning (DL) for cellular network throughput prediction. In contrast, this work proposes a simple, computationally lightweight solution that models the future throughput as a multiple linear regression of several present network parameters and the present throughput. It then feeds the variances of the prediction error and the measurement error, the latter inherent in any measurement setup but unaccounted for in existing works, into a Kalman filter-based prediction-correction approach to obtain optimal estimates of the future throughput. Extensive experiments across seven publicly available 5G throughput datasets and different prediction window lengths show that the proposed method outperforms baseline ML and DL algorithms, delivering more accurate results with shorter inference and retraining times. Furthermore, compared to its ML and DL counterparts, the proposed throughput prediction method also delivers higher QoE to both on-demand and live video users when used in conjunction with popular Model Predictive Control (MPC) based adaptive bitrate streaming algorithms. Comment: 13 pages, 14 figures
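    The prediction-correction idea above can be sketched as a scalar Kalman filter whose process model is the regression output. This is a minimal illustration, not the paper's implementation: the synthetic data, the noise variances Q and R, and the train/test split are all assumptions.

```python
import numpy as np

def fit_regression(X, y):
    """Least-squares fit of future throughput on present network features."""
    X1 = np.hstack([X, np.ones((len(X), 1))])  # append intercept column
    w, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return w

def kalman_throughput(preds, meas, Q, R):
    """Scalar Kalman prediction-correction.
    Q: variance of the regression's prediction error (assumed known).
    R: variance of the throughput measurement error."""
    P = 1.0                       # initial state covariance (assumption)
    out = []
    for z_pred, z_meas in zip(preds, meas):
        # predict: trust the regression output as the process model
        x, P = z_pred, P + Q
        # correct: blend in the noisy throughput measurement
        K = P / (P + R)           # Kalman gain
        x = x + K * (z_meas - x)
        P = (1 - K) * P
        out.append(x)
    return np.array(out)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                    # synthetic network parameters
true_tp = X @ np.array([2.0, -1.0, 0.5]) + 10.0  # synthetic ground-truth throughput
y = true_tp + rng.normal(scale=0.3, size=200)    # noisy measured throughput

w = fit_regression(X[:150], y[:150])             # train the regression
preds = np.hstack([X[150:], np.ones((50, 1))]) @ w
est = kalman_throughput(preds, y[150:], Q=0.1, R=0.3**2)
```

    The Kalman gain K weights the regression prediction against the measurement according to their respective error variances, which is the "unaccounted for" ingredient the abstract highlights.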

    Manifold-Preserving Transformers are Effective for Short-Long Range Encoding

    Multi-head self-attention-based Transformers have shown promise in different learning tasks. Although these models exhibit significant improvement in understanding short-term and long-term contexts from sequences, the encoders of Transformers and their variants fail to preserve layer-wise contextual information: they usually project tokens onto sparse manifolds and fail to preserve mathematical equivalence among the token representations. In this work, we propose TransJect, an encoder model that guarantees a theoretical bound on layer-wise distance preservation between any pair of tokens. We propose a simple alternative to dot-product attention that ensures Lipschitz continuity. This allows TransJect to learn injective mappings that transform token representations to different manifolds with similar topology and preserve the Euclidean distance between every pair of tokens in subsequent layers. Evaluations across multiple benchmark short- and long-sequence classification tasks show maximum improvements of 6.8% and 5.9%, respectively, over Transformer variants. Additionally, TransJect achieves 79% better performance than the Transformer on the language modeling task. We further highlight the shortcomings of multi-head self-attention from a statistical physics viewpoint. Although multi-head self-attention was conceived to learn different levels of abstraction within the network, our empirical analyses suggest that different attention heads learn in a random, unordered fashion. In contrast, TransJect adopts a mixture of experts for regularization; these experts are more orderly and balanced and learn different sparse representations from the input sequences. TransJect exhibits very low entropy and can be efficiently scaled to larger depths. Comment: 17 pages, 7 figures, 5 tables, Findings of the Association for Computational Linguistics: EMNLP202
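    The distance-preservation property the abstract bounds can be illustrated with the simplest injective, Lipschitz-continuous map: an orthogonal linear layer, which preserves pairwise Euclidean distances exactly. This is a toy stand-in for demonstration, not TransJect's actual attention alternative; the QR-based construction and dimensions are assumptions.

```python
import numpy as np

def orthogonal_layer(d, rng):
    """Random orthogonal map: injective, 1-Lipschitz, distance-preserving."""
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
    return Q

def pairwise_dists(X):
    """All pairwise Euclidean distances between rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.linalg.norm(diff, axis=-1)

rng = np.random.default_rng(1)
tokens = rng.normal(size=(8, 16))   # 8 tokens with 16-dim representations
W = orthogonal_layer(16, rng)
out = tokens @ W                     # a distance-preserving "layer"

# every pairwise token distance survives the layer unchanged
assert np.allclose(pairwise_dists(tokens), pairwise_dists(out))
```

    A stack of such maps preserves distances at every depth, which is the layer-wise guarantee the abstract claims standard dot-product attention lacks.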

    Persona-aware Generative Model for Code-mixed Language

    Code-mixing and script-mixing are prevalent across online social networks and multilingual societies. However, a user's preference toward code-mixing depends on their socioeconomic status, demographics, and the local context, which existing generative models mostly ignore while generating code-mixed texts. In this work, we make a pioneering attempt to develop a persona-aware generative model that generates texts resembling the real-life code-mixed texts of individuals. We propose a Persona-aware Generative Model for Code-mixed Generation, PARADOX, a novel Transformer-based encoder-decoder model that encodes an utterance conditioned on a user's persona and generates code-mixed texts without monolingual reference data. We propose an alignment module that re-calibrates the generated sequence to resemble real-life code-mixed texts. PARADOX generates code-mixed texts that are semantically more meaningful and linguistically more valid. To evaluate the personification capabilities of PARADOX, we propose four new metrics -- CM BLEU, CM Rouge-1, CM Rouge-L and CM KS. On average, PARADOX achieves 1.6 points better CM BLEU, 47% better perplexity and 32% better semantic coherence than its non-persona-based counterparts. Comment: 4 tables, 4 figures
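    The CM metrics are not defined in this abstract. One plausible reading of "CM KS" is a Kolmogorov-Smirnov distance between the Code-Mixing Index (CMI) distributions of generated and reference texts; the CMI formula, the language tags, and the toy utterances below are assumptions for illustration only.

```python
def cmi(tags):
    """Code-Mixing Index of one utterance, given per-token language tags.
    0 = monolingual; higher = more evenly mixed. 'other' tokens are excluded."""
    lang = [t for t in tags if t != "other"]
    if not lang:
        return 0.0
    max_lang = max(lang.count(t) for t in set(lang))
    return 100.0 * (1 - max_lang / len(lang))

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov distance between two lists of CMI scores."""
    xs = sorted(set(a) | set(b))
    cdf = lambda v, x: sum(1 for s in v if s <= x) / len(v)
    return max(abs(cdf(a, x) - cdf(b, x)) for x in xs)

# hypothetical generated vs. reference utterances (tags are made up)
gen = [cmi(["en", "hi", "hi", "other"]), cmi(["en", "en", "en"])]
ref = [cmi(["hi", "hi", "en"]), cmi(["en", "hi", "other", "hi"])]
score = ks_statistic(gen, ref)  # lower = generated mixing matches the reference
```

    Under this reading, a persona-aware generator should push the score toward zero for each user, since its mixing behaviour should track that user's real distribution.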

    Domain adaptation based transfer learning approach for solving PDEs on complex geometries

    In machine learning, if the training data are independently and identically distributed with the test data, a trained model can make accurate predictions for new data samples. Conventional machine learning depends strongly on massive amounts of domain-specific training data to uncover latent patterns. In contrast, domain adaptation and transfer learning are sub-fields of machine learning concerned with the inescapable problem of insufficient training data, which they address by relaxing the domain-dependence hypothesis. In this contribution, we address this issue and, by combining the two methods in a novel way, develop a computationally efficient and practical algorithm for solving boundary value problems governed by nonlinear partial differential equations. We adopt a meshfree analysis framework to integrate the prevailing geometric modelling techniques based on NURBS and present an enhanced deep collocation approach that also plays an important role in the accuracy of the solutions. We start with a brief introduction to how these methods expand upon this framework. We observe excellent agreement between these methods and show how fine-tuning a pre-trained network to a specialized domain can lead to outstanding performance compared to existing approaches. As proof of concept, we illustrate the performance of the proposed model on several benchmark problems. © 2022, The Author(s)
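    The transfer-learning step (pretrain on a data-rich source domain, then fine-tune on a scarce target domain with the feature layers frozen) can be sketched on a toy regression surrogate. The full method couples this with NURBS-based meshfree deep collocation, which is omitted here; the target functions, layer sizes, learning rates, and freezing scheme are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def init(d_in, d_h):
    """One-hidden-layer tanh network with random weights."""
    return {"W1": rng.normal(scale=0.5, size=(d_in, d_h)), "b1": np.zeros(d_h),
            "w2": rng.normal(scale=0.5, size=d_h), "b2": 0.0}

def forward(p, x):
    h = np.tanh(x @ p["W1"] + p["b1"])
    return h, h @ p["w2"] + p["b2"]

def train(p, x, y, lr=0.01, steps=2000, freeze_hidden=False):
    """Full-batch gradient descent on MSE; optionally update only the head."""
    n = len(x)
    for _ in range(steps):
        h, yhat = forward(p, x)
        g = 2.0 * (yhat - y) / n              # dL/dyhat for the MSE loss
        p["w2"] -= lr * h.T @ g
        p["b2"] -= lr * g.sum()
        if not freeze_hidden:                 # fine-tuning freezes the encoder
            gh = np.outer(g, p["w2"]) * (1 - h**2)
            p["W1"] -= lr * x.T @ gh
            p["b1"] -= lr * gh.sum(axis=0)
    return p

def mse(p, x, y):
    return float(np.mean((forward(p, x)[1] - y) ** 2))

# source domain: many samples of a smooth "solution" (stand-in for a PDE surrogate)
xs = rng.uniform(0, 1, size=(400, 1)); ys = np.sin(2 * np.pi * xs[:, 0])
# target domain: few samples of a shifted variant of that solution
xt = rng.uniform(0, 1, size=(20, 1));  yt = np.sin(2 * np.pi * xt[:, 0] + 0.3)

p = train(init(1, 32), xs, ys)                        # pretrain on the source
before = mse(p, xt, yt)
p = train(p, xt, yt, steps=500, freeze_hidden=True)   # fine-tune the head only
after = mse(p, xt, yt)
```

    Freezing the hidden layer reuses the source-domain features, so only a small linear problem is refit on the scarce target data, which is the computational saving the abstract alludes to.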